feat(pypi): support importing uv.lock file #3785
Conversation
Code Review
This pull request adds support for `uv.lock` files in rules_python, introducing a `toml2json` conversion utility and updating the `lock` rule and requirement parsing logic. The review identifies critical compatibility issues, specifically the `toml2json` tool's reliance on Python 3.11's `tomllib` and missing serialization for date/time objects. Furthermore, the feedback highlights logic errors in how package extras are handled—which could lead to dependency bloat or failed consistency checks—and suggests improvements for platform resolution and path handling in shell scripts.
| """Parse requirements using uv.lock as the primary source.""" | ||
| ret = _parse_uv_lock_json( | ||
| uv_lock_json = uv_lock_json, | ||
| all_platforms = _get_all_platforms(requirements_by_platform) if requirements_by_platform else [], |
If `requirements_by_platform` is empty (which is common when using `uv.lock` as the primary source), `all_platforms` will be an empty list. This results in all packages having empty `target_platforms`, which will likely cause issues in downstream rules that expect platform information for wheel selection.
```diff
-    all_platforms = _get_all_platforms(requirements_by_platform) if requirements_by_platform else [],
+    all_platforms = _get_all_platforms(requirements_by_platform) if requirements_by_platform else sorted(platforms.keys()),
```
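The suggested fallback can be sketched in plain Python; `_get_all_platforms` and the `platforms` dict here are hypothetical stand-ins for the real rules_python internals, not the PR's actual code:

```python
def _get_all_platforms(requirements_by_platform):
    # Hypothetical stand-in: union of the platforms each requirements
    # file is constrained to, in sorted order.
    plats = {}
    for plat_list in requirements_by_platform.values():
        for p in plat_list:
            plats[p] = None
    return sorted(plats)

def pick_all_platforms(requirements_by_platform, platforms):
    # Fall back to every configured platform when no per-platform
    # requirements files exist (the common uv.lock-only setup).
    if requirements_by_platform:
        return _get_all_platforms(requirements_by_platform)
    return sorted(platforms.keys())

print(pick_all_platforms({}, {"linux_x86_64": None, "osx_aarch64": None}))
# → ['linux_x86_64', 'osx_aarch64']
```

With the fallback in place, an empty `requirements_by_platform` no longer yields empty `target_platforms` for every package.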
```python
for extra in pkg.get("provides-extras", pkg.get("extras", [])):
    if extra not in entry["extras"]:
        entry["extras"][extra] = None
```
Including all `provides-extras` in the requirement line for every package is likely incorrect. `provides-extras` lists all extras a package defines, not necessarily what was resolved or requested. Including all of them will force the installation of all optional dependencies for every package in the lock file, leading to significant dependency bloat. It might be better to omit extras from the requirement line if the lock file already provides the specific version and URL, or to include only the extras that were part of the resolution.
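One way to avoid the bloat is to keep only the intersection of what the package provides and what some dependent actually requested. This is a hypothetical sketch (the `requested` map and `resolved_extras` helper are invented for illustration), not the PR's code:

```python
def resolved_extras(pkg, requested):
    # `requested` maps package name -> set of extras some dependent asked
    # for; `provides-extras` is everything the package could offer.
    provides = set(pkg.get("provides-extras", []))
    return sorted(provides & requested.get(pkg["name"], set()))

pkg = {"name": "coverage", "provides-extras": ["toml"]}
# Nobody requested coverage[toml], so no extras end up on the line:
print(resolved_extras(pkg, {}))                      # → []
# A dependent requested the "toml" extra, so it is kept:
print(resolved_extras(pkg, {"coverage": {"toml"}}))  # → ['toml']
```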
```python
for dep in pkg.get("dependencies", []):
    extra = dep.get("extra")
    if extra and extra not in uv_packages[name]["extras"]:
        uv_packages[name]["extras"][extra] = None
```
The logic for collecting extras in the consistency check appears to be flawed. It iterates over the dependencies of `pkg` and adds the extra name to `uv_packages[name]["extras"]`, where `name` is the name of the requiring package. Later, when checking a package `item`, it looks for extras in `uv_packages[item.name]["extras"]`. This will fail to find extras that were requested by other packages.
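A sketch of the correct attribution — recording each extra under the dependency's name (`dep["name"]`) rather than the requiring package's name, assuming dependency entries shaped like the quoted snippet; `collect_requested_extras` is a hypothetical helper:

```python
def collect_requested_extras(packages):
    # Map the *dependency's* name -> extras requested for it, so a later
    # lookup by item.name finds extras requested by any dependent package.
    requested = {}
    for pkg in packages:
        for dep in pkg.get("dependencies", []):
            extra = dep.get("extra")
            if extra:
                requested.setdefault(dep["name"], set()).add(extra)
    return requested

packages = [
    {"name": "pytest", "dependencies": [{"name": "coverage", "extra": "toml"}]},
    {"name": "coverage", "dependencies": []},
]
print(collect_requested_extras(packages))  # → {'coverage': {'toml'}}
```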
```python
import datetime
import json
import sys
import tomllib
```
```python
python_run = getattr(python, "short_path", python)

# Make python path absolute since uv changes directory via --directory
python_abs = "$PWD/" + python_path
```
Prepending `$PWD/` to `python_path` assumes that the path is relative to the execution root. If `python_path` is already an absolute path, this will result in an invalid path. It is safer to check if the path is absolute before prepending.
| python_abs = "$PWD/" + python_path | |
| python_abs = python_path if python_path.startswith("/") else "$PWD/" + python_path |
```python
def json_serializer(obj):
    if isinstance(obj, datetime.datetime):
        return obj.isoformat()
    raise TypeError(f"Object of type {type(obj).__name__} is not JSON serializable")
```
The `json_serializer` only handles `datetime.datetime`. However, `tomllib` can also return `datetime.date` and `datetime.time` objects for TOML date and time literals. These should also be serialized to ISO format to avoid a `TypeError`.
```diff
 def json_serializer(obj):
-    if isinstance(obj, datetime.datetime):
+    if isinstance(obj, (datetime.datetime, datetime.date, datetime.time)):
         return obj.isoformat()
     raise TypeError(f"Object of type {type(obj).__name__} is not JSON serializable")
```
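The widened serializer covers all three value types `tomllib` can produce. A quick standalone check, with sample data invented for illustration:

```python
import datetime
import json

def json_serializer(obj):
    # Handle every datetime flavour tomllib can emit for TOML literals.
    if isinstance(obj, (datetime.datetime, datetime.date, datetime.time)):
        return obj.isoformat()
    raise TypeError(f"Object of type {type(obj).__name__} is not JSON serializable")

data = {
    "released": datetime.date(2024, 1, 15),
    "built-at": datetime.datetime(2024, 1, 15, 12, 30),
    "cutoff": datetime.time(23, 59),
}
print(json.dumps(data, default=json_serializer))
# → {"released": "2024-01-15", "built-at": "2024-01-15T12:30:00", "cutoff": "23:59:00"}
```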
- tomllib: try/except fallback to tomli for Python <3.11
- json_serializer: add `datetime.date` and `datetime.time` support
- all_platforms: use `sorted(platforms.keys())` fallback
- `$PWD/` path: check if `python_path` is absolute
- extra_pip_args: pass through to `_parse_uv_lock_json`
- Add uv_lock tests: multiple packages, extra_pip_args, multi-os
- Update plan.md with review cycle instructions
Addressed all review comments. Please re-review when you get a chance. @gemini-code-assist
Code Review
This pull request introduces support for `uv.lock` files in rules_python, adding a new `lock_format` attribute to the `lock` rule and a `toml2json` utility for parsing. It also implements a consistency check between `uv.lock` and requirements files. The reviewer identified a potential runtime error in the `uv.lock` parser due to unsafe dictionary access for optional hash keys and suggested simplifying redundant logic during hash extraction.
```python
seen = {}
for wheel in pkg.get("wheels", []):
    sha256 = wheel["hash"].replace("sha256:", "")
```
The `hash` key is optional for wheels in the `uv.lock` file format. Using direct dictionary access `wheel["hash"]` will cause a runtime error if the key is missing. You should use `.get()` for safe access, similar to how it is used for sdist hashes.
| sha256 = wheel["hash"].replace("sha256:", "") | |
| sha256 = wheel.get("hash", "").replace("sha256:", "") |
```python
sdist = pkg.get("sdist")
if sdist:
    sha256 = sdist.get("hash", "").replace("sha256:", "") if sdist.get("hash") else ""
```
This line can be simplified: the trailing `if sdist.get("hash") else ""` is redundant, since `sdist.get("hash", "").replace("sha256:", "")` already correctly handles cases where the hash is missing or empty.
| sha256 = sdist.get("hash", "").replace("sha256:", "") if sdist.get("hash") else "" | |
| sha256 = sdist.get("hash", "").replace("sha256:", "") |
- Add python_3_14 toolchain in MODULE.bazel
- Always use Python 3.14 interpreter for uv.lock JSON conversion
- Remove tomli fallback from toml2json.py (Python 3.14 has tomllib)
- Simplify convert_uv_lock_to_json to use fixed interpreter
- Add bazel-in-bazel integration test under tests/integration/uv_lock/
- Test converts uv.lock TOML to JSON using toml2json tool
- Remove dead uv_pypi test data (placeholder only)
- Restore tomli fallback in toml2json.py for backward compatibility
Addressed all review comments and added Python 3.14 toolchain + integration test. Please re-review. @gemini-code-assist
Code Review
This pull request introduces support for `uv.lock` files as a primary metadata source in rules_python. Key additions include a `toml2json` tool for parsing, a new `uv_lock` format for the `lock` rule, and consistency checks within `parse_requirements`. Reviewer feedback suggests adopting more idiomatic Starlark dictionary access, implementing safer hash prefix stripping, and improving the portability of shell commands by replacing hardcoded environment variables with Bazel-provided paths.
```python
uv_packages = {}
for pkg in packages:
    name = pkg["name"]
    if (name == "rules_python" and is_rules_python_root) or (pkg.get("source") or {}).get("virtual"):
```
Using `pkg.get("source", {})` is more idiomatic in Starlark and Python than `(pkg.get("source") or {})` when you want to provide a default dictionary for a missing or `None` key.
| if (name == "rules_python" and is_rules_python_root) or (pkg.get("source") or {}).get("virtual"): | |
| if (name == "rules_python" and is_rules_python_root) or pkg.get("source", {}).get("virtual"): |
```python
seen = {}
for wheel in pkg.get("wheels", []):
    sha256 = wheel["hash"].replace("sha256:", "")
```
It is safer to check whether the hash string actually starts with the `sha256:` prefix before attempting to replace it, or to use a more robust way to strip the algorithm prefix in case `uv` ever supports other hash types in the lock file.
| sha256 = wheel["hash"].replace("sha256:", "") | |
| sha256 = wheel["hash"][7:] if wheel["hash"].startswith("sha256:") else wheel["hash"] |
```python
sdist = pkg.get("sdist")
if sdist:
    sha256 = sdist.get("hash", "").replace("sha256:", "") if sdist.get("hash") else ""
```

```python
    filename = filename,
    yanked = None,
))
elif (pkg.get("source") or {}).get("git"):
```
```python
python_run = getattr(python, "short_path", python)

# Make python path absolute since uv changes directory via --directory
python_abs = python_path if python_path.startswith("/") else "$PWD/" + python_path
```
Hardcoding `$PWD/` for absolute path construction in a shell command might be problematic on some platforms or environments. Consider using `ctx.bin_dir.path` or similar Bazel-provided paths if possible, or ensure this is only executed in a bash-compatible environment where `PWD` is guaranteed to be set correctly.
The latest release of toml.bzl makes bazel_lib a dev dependency. It now uses skylib for the
This is vibe coded, but I thought that the approach might have been rigorous enough to submit a PR.

The strategy was:

1. Generate the `uv.lock` file from the `lock` rule.
2. Parse the `uv.lock` file together with the requirements file and verify things are OK.
Extra things that we could do:

- Upstream patches to remove the `bazel_lib` dependency (which I think could become a `dev_dependency`). This way we won't transitively depend on `rules_go` and we won't need to download the interpreter to read the `uv.lock` file.
- Cover more `uv.lock` scenarios and ensure parity with `requirements.txt` files.
- Query the `PyPI` index to understand if the packages are yanked or not - the lockfile does not have that information.
- Parse the `pyproject.toml` file to get the index values for each package.

Summary:
Closes #3557
Work towards #2787